perm filename GEM[0,BGB]3 blob
sn#078586 filedate 1973-12-23 generic text, type C, neo UTF8
COMMENT ⊗ VALID 00014 PAGES
C REC PAGE DESCRIPTION
C00001 00001
C00002 00002 3.0 GEOMETRIC MODELING THEORY.
C00003 00003 Introduction to Geometric Modeling.
C00004 00004
C00007 00005
C00009 00006 The geometric model has several roles:
C00011 00007 Geometric Modeling.
C00018 00008 Camera Model.
C00022 00009 The Vertex.
C00026 00010 The Edge.
C00029 00011 The Face.
C00032 00012 Polyhedra, Bodies and Objects.
C00036 00013 Three kinds of perimeter.
C00038 00014 BFEV Accessing.
C00042 ENDMK
C⊗;
3.0 GEOMETRIC MODELING THEORY.
Introduction to Geometric Modeling.
The word "geometry" literally means "earth measure" in Greek.
Before Euclid, geometry consisted of manual methods for measuring
things; so in a sense the kind of geometry in this chapter is
"pre-Euclidean", in that it is a theory composed of description
techniques, empirically found.
In the context of computer vision and artificial
intelligence,world modeling refers to representing and
simulating a world on a computer so that the course of world
events can be perceived, predicted and altered. Practical world
modeling issues involve which aspects of the world are significant
and worth modeling; which programming technologies should be used;
and what are the trade offs between storing the model and recomputing
the model. With respect to the last issue, the term "representation"
indicates a concern with data structure and time independent features
while the term "simulation" indicates concern with process and dynamic
situations.
In a more abstract definition, the world model is the third
of four nested universal Turing machines. The outermost Turing
machine is called the real world; the second Turing Machine is
simulated on the world machine and may be called the robot; the third
Turing machine is simulated on the robot's machine and is called the
robot's world model. The fourth Turing machine is the robot's self
model, at which point the recursion can be stopped for the present
discussion. Although the nesting allows for self perception and self
manipulation, we will assume for the following that the robot's
hardware can be isolated from the real world so that it is just metal
box.
The world is something that has states that can be changed.
From the inside of an intellectual entity it appears as if
the overwhealming majority of world states change spontaneously,
(or by means of the mind of God, if you are a Berkelian theist);
and the necessarily few states are controled.
Kinds of world modeling in Artificial Intelligence.
Geometric, semantic, logical, procedural
---------------------------------------------------------
| Five kinds of geometric models: |
| |
| 1. 3-D Space Array. |
| 2. 2-D Surface Function Z=F(x,y). |
| 3. Manifolds: Body, Face, Edge, Vertex. |
| 4. Spine Cross Section. |
| 5. Cellular. |
---------------------------------------------------------
The geometric model has several roles:
1. Synthesis source for verification.
2. Analysis target for revelation.
3. Domain,
for problem solving,
for spatial visualization,
for semantic definition (noun concepts),
for recognition feature classification,
for planning.
The theory;
Assumption: The vision world model should be three dimensional.
The vision world model should be able to represent
surfaces of opaque objects.
In order to get a computer to deal with the physical world it
must have a data representation on which computations involving
space, time, shape, size, and the appearance of things can be done.
Geometric Modeling.
I will introduce my requirements for a computer model of the
physical world in terms of its role as a memory. As a memory, a
world model has contents and an addressing mechanism. The kinds of
data that I wish to hold in my world model are:
CONTENT REQUIREMENTS
1. Topological data.
2. Geometric data.
3. Photometric data.
4. Parts tree data.
Topological data has to do with the notion of neighborhood; a
world model has data on what is next to what. A face, edge, vertex
model is essentially dedicated to surface topology; matters of volume
topology are not stored but rather must be computed. Geometric data
has to do with notions such as locus, length, area and volume.
Photometric data includes the locus and nature of light sources, as
well as data on how surfaces reflect, absorb and scatter light. Parts
tree data has to do with the notion that objects are composed of
parts, which I construe as data on the structure of the physical
world rather than as purely an artifact of having structured world
data; that is, I prefer to have the specification of how an entity is
broken into parts be external to my world model. The kinds of data
not included are semantic data (other than body names); physical data
such as mass, inertia tensors, electrical properties and so on; and
cultural data such as whether an object is a toy, tool, or weapon;
with any artistic, religious or market value.
Next the kinds of addressing mechanisms I wish to have, (or
equivalently the input-output modes of the model) are:
ACCESSING REQUIREMENTS
1. Appearance - given a camera, return an image of
what the world would look like from that camera.
2. Recognition - given an image, return the objects
from the world model that appear in that image.
3. Camera Solution - given a recognized image,
find the location & orientation of the camera.
4. Perception - given images, from solved cameras,
place new bodies into the model for portions of
the images that have not yet been recognized.
5. Spatial Accessing - given a locus and radius,
return the portions of objects in that sphere.
Clearly, these are the high level accessing requirements which are
the reasons for having a world model and the design goals for model
building.
Camera Model.
As the accessing requirements imply, a world model requires a
special entity called a camera which is used to model image
formation. Although the camera model is important here for a complete
specification of the data, it may be skipped on a first reading. The
particular camera model I have been using lately, is expressed by
eighteen real numbers involving nine degrees of freedom. First there
is the camera lens center locus:
CX, CY, CZ, in world coordinates.
Afixed to the lens center is the camera frame of reference with unit
vectors i, j and k. When the image formed by the camera is placed in
correspondence to a display screen, as illustrated in figure 1.3, the
unit vector i maps into the rightward positive x of the display
screen, and the unit vector j maps into the upward positive y of the
display screen, and the unit vector k comes out of the display screen
to form a right handed coordinate system. Together the three unit
vectors of the camera are the three by three rotation matrix:
IX, IY, IZ
JX, JY, JZ in world coordinates.
KX, KY, KZ
Next, there are three scales which determine the correspondence
between world size and image size. Observe that the world is measured
in physical units of length like meters or feet while computer images
come in integral sizes like 1024 by 1024 or 480 rows by 512 columns,
thus the conversion scales must be in terms of logical image units
per physical world units. In an actual television camera a minute
image (say 9mm by 12mm) is formed on a vidicon tube and that image
has a particular number of rows and columns. It is the little image
on the vidicon that we pretend to model by the six parameters:
LDX, LDY, LDZ Logical raster size.
PDX, PDY, FOCAL Physical raster size.
Where the number named FOCAL, is the focal plane distance which in
this model (with distant objects) can safely be equated with the lens
focal length and can be given in millimeters (conventional lens run
12.5mm to 75mm for 1" TV). The integer LDZ is an artifact so that
the units come out correctly in the Z dimension. Thus the scales
factors are defined:
SCALEX ← -FOCAL*LDX/PDX;
SCALEY ← -FOCAL*LDY/PDY;
SCALEZ ← FOCAL*LDZ;
This simple camera model is used to compute vertex image
coordinates. A more elaborate physical camera model can be found in
Sobel [reference 9].
The Vertex.
A vertex is a labeled point in a Euclidean three
space. The important thing about a vertex is its world locus (with
component names XWC,YWC,ZWC standing for world-coordinates). Vertex
locii are the basic geometric data from which length, area, volume,
face vectors and image positions can be computed. Also a Vertex may
exist simultaneously in one or more image spaces. An image space
(with component names XPP,YPP,ZPP standing for perspective-projected)
is always three dimensional and is determined with respect to a given
camera centered coordinate system (with component names XCC,YCC,ZCC
standing for camera-coordinates). The third image component, ZPP,
is taken inversely proportional to the distance of the vertex from
the camera image plane, ZCC. Using the camera of the previous
section. The transformation of a vertex world locus to a camera
centered locus is:
X ← XWC - CX;
Y ← YWC - CY;
Z ← ZWC - CZ;
XCC ← IX*X + IY*Y + IZ*Z;
YCC ← JX*X + JY*Y + JZ*Z;
ZCC ← KX*X + KY*Y + KZ*Z;
The first three assignment statements are the translation to
the camera frame's origin, the second three assignments are the
rotation to the camera frame's orientation. Next the perspective
projection transformation is computed:
XPP ← SCALEX*XCC/ZCC;
YPP ← SCALEY*YCC/ZCC;
ZPP ← SCALEZ /ZCC;
The XPP and YPP assignments are derived by means of similar
triangles, as is being done by the man in figure 1.5; the Zpp
assignment is for preserving the depth information and the
colinearity of the world in the perspective projected image space. As
given, the PP frame is right handed and vertices in front of the
camera's image plane will have negative Zpp; Zpp values near -FOCAL
are close to the camera and values approaching zero are far away.
A final matter with respect to vertices is their valence. The
valence of a vertex is the number of edges that meet at the vertex. A
vertex valence of three, for example, indicates a trihedral corner.
The Edge.
For a start, the structure of an edge need be thought of as
little more than two vertices; the topological subtlety of edges will
be explained later. However, two vertices do define the important
geometric edge data called the 2D line coefficients. Named AA, BB,
CC; these coefficients are computed from the perspective locus of the
edge's endpoints as follows:
AA ← Y1 - Y2;
BB ← X2 - X1;
CC ← X1*Y2 - X2*Y1;
These coefficients appear in the 2D equation of the line that
contains the edge:
0 = AA*X + BB*Y + CC;
When the edge coefficients are normalized:
L ← SQRT(AA↑2+BB↑2);
AA ← AA/L;
BB ← BB/L;
CC ← CC/L;
the line equation gives the distance, of a point X,Y from the line:
Q ← AA*X + BB*Y + CC;
The distance is actually ABS(Q), since Q is negative on one side side
of the line; also if one were standing on the plane at point X1,Y1
facing X2,Y2 the Q positive half-plane would be on your left and the
Q negative half plane would be on your right.
An important operation on two edges is to detect whether or
not they intersect; this can be decided by checking first whether the
endpoints of one edge are in the opposite half planes of the other
edge, and second whether the endpoints of the latter edge are in the
opposite half planes of the first. When both conditions obtain, then
the intersection point can be found:
T ← (A1*B2 - A2*B1);
X ← (B1*C2 - B2*C1)/T;
Y ← (A2*C1 - A1*C2)/T;
An actual compare for intersection should initially check for the
identity case, and for edges with a vertex in common.
The Face.
A face is a finite region of plane enclosed by straight
lines. A safe formal face structure could be built by defining a
triangle as three non-colinear vertices and then insisting that all
faces be triangle interiors. Unhappily, BFEV faces are usually
represented as a list of vertices and edges (or by something nearly
equivalent) for the sake of saving memory space. Such `list' faces
are not monolithic but tend to suffer special cases and pathologies
such as:
Coincident or crossing edges,
Holes and Disjointness,
Concavity (& Convexity),
Non-coplanarity.
Like edges, faces have characteristic coefficients. Face coefficients
satisfy the equation of a plane in which the face is embedded:
AA*X + BB*Y + CC*ZZ = KK.
The equation could be divided by KK, but that is undesirable because
the AA, BB, CC are more useful as a unit normal vector, in which case
KK is the distance of the origin from the plane. Given the locii of
three non-colinear vertices, the coefficients of a plane can be
computed by Kramer's rule as follows:
KK ← X1*(Z2*Y3-Y2*Z3)
+ Y1*(X2*Z3-Z2*X3)
+ Z1*(Y2*X3-X2*Y3);
AA ← (Z1*(Y2-Y3) + Z2*(Y3-Y1) + Z3*(Y1-Y2));
BB ← (X1*(Z2-Z3) + X2*(Z3-Z1) + X3*(Z1-Z2));
CC ← (X1*(Y3-Y2) + X2*(Y1-Y3) + X3*(Y2-Y1));
and normalized:
ABC ← SQRT(AA↑2 + BB↑2 + CC↑2);
AA ← AA/ABC;
BB ← BB/ABC;
CC ← CC/ABC;
KK ← KK/ABC;
If the given vertices V1, V2, V3 had been taken going counter
clockwise about the face as viewed from the exterior of the solid,
then the following relations obtain:
AA*X + BB*Y + CC*Z > KK implies X,Y,Z above the plane.
AA*X + BB*Y + CC*Z = KK implies X,Y,Z in the plane.
AA*X + BB*Y + CC*Z < KK implies X,Y,Z below the plane.
Face coefficients prove useful in both world and image coordinates.
Polyhedra, Bodies and Objects.
In elementary geometry, a polyhedron is said to be a solid
formed (or bounded) by plane faces; the word "polyhedron" literally
meaning "many-faced". Topologically, simple polyhedra satisfy
Euler's F-E+V=2 equation; where F, E and V are the number of faces,
edges and vertices of the polyhedron respectively. This equation was
known to Descartes in 1640, but the first proof wasn't given until
1752, when Euler proved the relation by considering the graph
corresponding to the edges of polyhedra. A simple polyhedron is one
homeomorphic to a sphere. The rigorous development of volume measure,
and in turn `solid' polyhedra, is not simple; thus it has been easier
to take the topological notion F-E+V=2 as the more primitive
definition of a polyhedron on which to base a data structure and to
proceed towards the appearance of `solidness' which is a more
complicated notion.
Counter to the usual usuage, I define the word "body" to mean
an entity more specific than a polyhedron; the idea being that a
polyhedron is represented by the whole structure of bodies, faces,
edges and vertices. Bodies may have location, orientation and volume
in space. Bodies may be conected to faces, edges and vertices, which
may or may not form a complete polyhedron. It is typical to have
only one body to a polyhedron when representing a rigid object like a
sledge hammer and several bodies to a polyhedron when representing a
flexible object like a man. Furthermore, the body concept is used to
handle the notion of parts and abstract regional objects such as a
parking lot. For example, the Stanford AI Parking Lot is
represented by a body that has three parts: the Near, Mid and Far
Lots. The Near Lot then has aisles and lanes and lamp islands; a lamp
island has a curb and two lamps; a lamp has a base, stem and top.
This parts structure is carried in body nodes. Finally, the word
"object" will be used to refer to physical objects such as a
redwood-tree, building, or roadway.
Three kinds of perimeter.
Figure 1.6
FACE PERIMETER - a face is surrounded by edges and vertices.
VERTEX
⊗
/ \
/ \
/ \
/ \
EDGE / \ EDGE
/ \
/ FACE \
/ \
/ \
/ \
VERTEX ⊗---------------------⊗ VERTEX
EDGE
Figure 1.7
VERTEX PERIMETER - a vertex is surrounded by edges and faces.
EDGE
|
|
FACE | FACE
|
⊗ VERTEX
/ \
/ \
/ \
/ \
/ FACE \
EDGE EDGE
Figure 1.8
EDGE PERIMETER - an edge is surrounded by 2 faces & 2 vertices.
VERTEX
⊗
|
|
FACE E FACE
|
|
⊗
VERTEX
BFEV Accessing.
1. Accessing by name and serial number.
2. Parts-Tree Accessing.
3. FEV Sequential Accessing.
4. FEV Perimeter Accessing.
A BFEV model has four kinds of accessing. The most
conventional BFEV access is retrieval by symbolic name which requires
a symbol table. Next, between bodies there is Parts-Tree accessing.
At the top of the Parts-Tree is a special body named the world to
which all the other bodies are attached; thus the world body serves
as an OBLIST node. Given a particular body, a list of its sub-parts
can be retrieved as well as its supra-part or "supart". A supart is
the whole entity to which a part belongs, the world being its own
supart.
Within each body there is face, edge and vertex sequential
accessing. Given a body, all its faces, or edges, or vertices need to
be readily available since perspective projection loops thru all the
vertices, and the process of display clipping loops thru all the
edges, and the act of checking for body intersection loops thru all
the faces. In LISP, one might provide FEV sequential accessing by
placing a list of faces, a list of edges and a list of vertices on
the property list of each body, so that a cube might be represented
as:
(DEFPROP CUBE (F1 F2 F3 F4 F5 F6) FACES)
(DEFPROP CUBE (E1 E2 E3 E4 E5 E6 E7 E8 E9 E10 E11 E12)EDGES)
(DEFPROP CUBE (V1 V2 V3 V4 V5 V6 V7 V8) VERTICES)
Finally, among the faces, edges and vertices of a body there
is perimeter accessing. Faces have a perimeter of edges and vertices
[figure 1.6]; less commonly used, vertices have a perimeter of edges
and faces [figure 1.7]; and of particular note, edges have a
perimeter always formed by two faces and two vertices, [figure 1.8].
Perimeter accessing requires that given a face, edge or vertex, that
the perimeter of that entity be readily accessible. Since the surface
of a polyhedron is orientable, that is has a well defined inside and
outside, (Klein bottles with their crosscaps will not be modeled),
such perimeter lists can be ordered (say clockwise) with respect to
the exterior of the polyhedron. Perimeter accessing is mentioned in
Guzman [reference 6] and Falk [reference 4] and is the underlying
basis of part-II of this paper which presents a polyhedron model
built for accessing and altering face, edge and vertex perimeters.